This folder contains the files and code needed to generate the middle and bottom rows of Figure 7 in the paper.

The file baummerge2.csv is the raw data for use in R.

The file impbaum.r reads in the raw data, transforms it into the variables of interest, removes or reformats
problematic entries in the raw data, and then creates 100 imputed datasets.

Because this is a large dataset, imputation is slow, so it was run on Condor, a batch system in the Research
Computing Environment at the Harvard-MIT Data Center.

Imputed datasets are named "hhABCDE.csv", where ABCDE is a random string of digits.  These names are generated so that
imputed datasets produced by jobs run in parallel on Condor, which are all written to one fixed location, do not
overwrite each other.
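
The naming scheme can be sketched in R as follows. This is only an illustration, not the code in impbaum.r: the
function name make_imputation_name is hypothetical, and the five-digit length is an assumption read off the
"ABCDE" placeholder.

```r
# Hypothetical helper (not taken from impbaum.r): build an output path of the
# form "hhABCDE.csv", where ABCDE is assumed to be a random five-digit number.
make_imputation_name <- function(dir = ".") {
  repeat {
    id <- sample(10000:99999, 1)                      # random five-digit integer
    path <- file.path(dir, sprintf("hh%05d.csv", id))
    if (!file.exists(path)) return(path)              # skip names already taken
  }
}
```

A job would then write its result with something like write.csv(imputed, make_imputation_name("output")), where
"imputed" and "output" are placeholders. The existence check narrows but does not eliminate the chance that two
parallel jobs pick the same name at the same moment.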

One hundred such datasets are then renamed h1.csv through h100.csv. The first ten of these are copied unchanged into the
folder "table1", where the replication and reanalysis after imputation of the results in Baum and Lake (2003) are conducted.
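
The renaming step could be done along these lines. This is a sketch under assumptions, not the procedure actually
used: the function name rename_imputations is hypothetical, and assigning numbers in sorted filename order is an
arbitrary choice.

```r
# Hypothetical helper: rename the first n files matching hh<digits>.csv in
# `dir` to h1.csv ... hn.csv. Numbering follows sorted filename order, which
# is an arbitrary choice made for this sketch.
rename_imputations <- function(dir = ".", n = 100) {
  old <- sort(list.files(dir, pattern = "^hh[0-9]+\\.csv$", full.names = TRUE))
  if (length(old) < n) {
    stop("expected at least ", n, " imputed datasets, found ", length(old))
  }
  old <- old[seq_len(n)]
  new <- file.path(dir, sprintf("h%d.csv", seq_len(n)))
  ok <- file.rename(old, new)                 # vectorised over both arguments
  if (!all(ok)) stop("some files could not be renamed")
  invisible(new)
}
```

The first ten results could then be copied with file.copy(file.path(dir, sprintf("h%d.csv", 1:10)), "table1"),
assuming "table1" is reachable from the working directory.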

The file figure7b plots the distribution of imputed values for the missing elements against the observed values of the
dependent variable Female Life Expectancy.

The file figure7c plots the distribution of imputed values for the missing elements against the observed values of the
dependent variable Female Secondary Schooling.
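
One way to draw this kind of comparison in R is sketched below. It is not the code in figure7b or figure7c: the
function name compare_imputed and its arguments are hypothetical, and the use of kernel density estimates is an
assumption about how the two distributions are displayed.

```r
# Hypothetical helper: overlay the density of observed values with the density
# of values imputed for the missing cells of the same variable.
#   original     - the variable before imputation (NA marks a missing cell)
#   imputed_list - list of completed versions of that variable, one per
#                  imputed dataset, each the same length as `original`
compare_imputed <- function(original, imputed_list, label = "") {
  miss <- is.na(original)
  observed <- original[!miss]
  # Pool the imputed values for the originally missing cells across datasets.
  imputed <- unlist(lapply(imputed_list, function(x) x[miss]))
  d_obs <- density(observed)
  d_imp <- density(imputed)
  plot(d_obs, main = label, xlab = label,
       xlim = range(d_obs$x, d_imp$x), ylim = range(0, d_obs$y, d_imp$y))
  lines(d_imp, lty = 2)                       # dashed line for imputed values
  legend("topleft", legend = c("Observed", "Imputed"), lty = c(1, 2), bty = "n")
  invisible(list(observed = d_obs, imputed = d_imp))
}
```

The completed columns would come from h1.csv through h100.csv; the relevant column names are not listed here and
would have to be taken from the data files themselves.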

See also the folder "table1" for the replication and reanalysis after imputation of the Baum and Lake results.
